Quotient Complexity of Closed Languages * 



Janusz Brzozowski^1, Galina Jiraskova^2, and Chenglong Zou^1

^1 David R. Cheriton School of Computer Science, University of Waterloo,
Waterloo, ON, Canada N2L 3G1
{brzozo@, c2zou@student.math.}uwaterloo.ca
^2 Mathematical Institute, Slovak Academy of Sciences,
Gresakova 6, 040 01 Kosice, Slovakia
{jiraskov@saske.sk}



Abstract. A language L is prefix-closed if, whenever a word w is in L, 
then every prefix of w is also in L. We define suffix-, factor-, and subword- 
closed languages in the same way, where by subword we mean subse- 
quence. We study the quotient complexity (usually called state com- 
plexity) of operations on prefix-, suffix-, factor-, and subword-closed lan- 
guages. We find tight upper bounds on the complexity of the prefix-, 
suffix-, factor-, and subword-closure of arbitrary languages, and on the 
complexity of boolean operations, concatenation, star and reversal in 
each of the four classes of closed languages. We show that repeated ap- 
plication of positive closure and complement to a closed language results 
in at most four distinct languages, while Kleene closure and complement 
gives at most eight languages. 

Keywords: automaton, closed, factor, language, prefix, quotient, state 
1 Introduction

The state complexity of a regular language $L$ is the number of states in the
minimal deterministic finite automaton (dfa) recognizing $L$. The state
complexity of an operation $f(K, L)$ (or $g(L)$) in a subclass $C$ of regular
languages is the maximal state complexity of the language $f(K, L)$ (or $g(L)$),
when $K$ and $L$ range over all languages in $C$. For a detailed discussion of
general issues of state complexity see [4,22] and the reference lists in those
papers. In 1994 the complexity of concatenation, star, left and right quotients,
reversal, intersection and union in regular languages was examined in detail
in [23]. The complexity of operations was also considered in several subclasses
of regular languages: finite [22], unary [18,23], prefix-free [13] and
suffix-free [12], and ideal languages [6]. These studies show that the
complexity can be significantly lower in a subclass than in the general case.
Here we examine state complexity in the classes of prefix-, suffix-, factor-,
and subword-closed regular languages.
There are several reasons for considering closed languages. They appear
often in theoretical computer science. Subword-closed languages were studied
in 1969 [11], and also in 1973 [20]. Suffix-closed languages were considered in
1974 [10], and later in [9,14,21]. Factor-closed languages, also called
factorial, have received some attention, for example, in [2,16]. Subword-closed
languages were also studied in [17]. Prefix-closed languages play a role in
predictable semiautomata [7]. All four classes of closed languages were examined
in [1], and decision problems for closed languages were studied in [8]. A
language $L$ is a left ideal (respectively, right, two-sided, all-sided ideal)
if $L = \Sigma^*L$ (respectively, $L = L\Sigma^*$, $L = \Sigma^*L\Sigma^*$, and
$L = \Sigma^* \sqcup\!\sqcup L$, where $\Sigma^* \sqcup\!\sqcup L$ is the shuffle
of $\Sigma^*$ with $L$). Closed languages are related to ideal languages as
follows [1]: For every non-empty $L$, $L$ is a right (left, two-sided,
all-sided) ideal if and only if its complement $\overline{L}$ is a prefix
(suffix, factor, subword)-closed language. Closed languages are defined by the
binary relations "is a prefix of" (respectively, "is a suffix of", "is a factor
of", "is a subword of") [1], and are special cases of convex languages [1,20].
The fact that the four classes of closed languages are related to each other
permits us to obtain many complexity results using similar methods.

2 Quotient Complexity 

If $\Sigma$ is a non-empty finite alphabet, then $\Sigma^*$ is the free monoid
generated by $\Sigma$. A word is any element of $\Sigma^*$, and $\varepsilon$ is
the empty word. The length of a word $w \in \Sigma^*$ is $|w|$. A language over
$\Sigma$ is any subset of $\Sigma^*$. The cardinality of a set $S$ is denoted
by $|S|$.

If $w = uxv$ for some $u, v, x \in \Sigma^*$, then $u$ is a prefix of $w$, $v$
is a suffix of $w$, and $x$ is a factor of $w$. If $w = w_0a_1w_1 \cdots a_nw_n$,
where $a_1, \ldots, a_n \in \Sigma$, and $w_0, \ldots, w_n \in \Sigma^*$, then
$v = a_1 \cdots a_n$ is a subword of $w$.

A language $L$ is prefix-closed if $w \in L$ implies that every prefix of $w$ is also
in L. In the same way, we define suffix-, factor-, and subword-closed languages. 
A language is closed if it is prefix-, suffix-, factor-, or subword-closed. 

The following set operations are defined on languages: complement
($\overline{L} = \Sigma^* \setminus L$), union ($K \cup L$), intersection
($K \cap L$), difference ($K \setminus L$), and symmetric difference
($K \oplus L$). A general boolean operation with two arguments is denoted by
$K \circ L$. We also define the product, usually called concatenation or
catenation, ($KL = \{w \in \Sigma^* \mid w = uv, u \in K, v \in L\}$), (Kleene)
star ($K^* = \bigcup_{i \ge 0} K^i$), and positive closure
($K^+ = \bigcup_{i \ge 1} K^i$). The reverse $w^R$ of a word $w \in \Sigma^*$ is
defined as follows: $\varepsilon^R = \varepsilon$, and $(wa)^R = aw^R$. The
reverse of a language $L$ is denoted by $L^R$ and is defined as
$L^R = \{w^R \mid w \in L\}$.

Regular languages over $\Sigma$ are languages that can be obtained from the set
of basic languages $\{\emptyset, \{\varepsilon\}\} \cup \{\{a\} \mid a \in \Sigma\}$,
using a finite number of operations of union, product and star. Such languages
are usually denoted by regular expressions. If $E$ is a regular expression, then
$\mathcal{L}(E)$ is the language denoted by that expression. For example,
$E = (\varepsilon \cup a)^*b$ denotes
$L = \mathcal{L}(E) = (\{\varepsilon\} \cup \{a\})^*\{b\}$. We usually do not
distinguish notationally between regular languages and regular expressions; the
meaning is clear from the context.
A deterministic finite automaton (dfa) is a tuple
$\mathcal{D} = (Q, \Sigma, \delta, q_0, F)$, where $Q$ is a set of states,
$\Sigma$ is the alphabet, $\delta : Q \times \Sigma \to Q$ is the transition
function, $q_0$ is the initial state, and $F$ is the set of final or accepting
states. A nondeterministic finite automaton (nfa) is a tuple
$\mathcal{N} = (Q, \Sigma, \eta, Q_0, F)$, where $Q$, $\Sigma$ and $F$ are as in
a dfa, $\eta : Q \times \Sigma \to 2^Q$ is the transition function and
$Q_0 \subseteq Q$ is the set of initial states. If $\eta$ also allows
$\varepsilon$, i.e., $\eta : Q \times (\Sigma \cup \{\varepsilon\}) \to 2^Q$, we
call $\mathcal{N}$ an $\varepsilon$-nfa.

Our approach to quotient complexity follows closely that of [4]. Since state 
complexity is a property of a language, it is more appropriately defined in 
language-theoretic terms. The left quotient, or simply quotient, of a language
$L$ by a word $w$ is the language $L_w = \{x \in \Sigma^* \mid wx \in L\}$. The
quotient complexity of $L$ is the number of distinct quotients of $L$, and is
denoted by $\kappa(L)$.

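Quotients of regular languages [3,4] can be computed as follows: First, the
$\varepsilon$-function of a regular language $L$ is $L^\varepsilon = \emptyset$
if $\varepsilon \notin L$ and $L^\varepsilon = \varepsilon$ if
$\varepsilon \in L$. The quotient by a letter $a \in \Sigma$ is computed by
structural induction: $b_a = \emptyset$ if $b \in \{\emptyset, \varepsilon\}$ or
$b \in \Sigma$ and $b \ne a$, and $b_a = \varepsilon$ if $b = a$;
$(\overline{L})_a = \overline{L_a}$, $(K \cup L)_a = K_a \cup L_a$,
$(KL)_a = K_aL \cup K^\varepsilon L_a$, $(K^*)_a = K_aK^*$. The quotient by a
word $w \in \Sigma^*$ is computed by induction on the length of $w$:
$L_\varepsilon = L$; $L_w = L_a$ if $w = a \in \Sigma$; $L_{wa} = (L_w)_a$. A
quotient $L_w$ is accepting if $\varepsilon \in L_w$; otherwise it is rejecting.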

The quotient automaton of a regular language $L$ is
$\mathcal{D} = (Q, \Sigma, \delta, q_0, F)$, where
$Q = \{L_w \mid w \in \Sigma^*\}$, $\delta(L_w, a) = L_{wa}$,
$q_0 = L_\varepsilon = L$, and $F = \{L_w \mid (L_w)^\varepsilon = \varepsilon\}$.
This is the minimal dfa accepting $L$; hence the quotient complexity of $L$ is
equal to the state complexity of $L$. However, there are some advantages to
using quotients [4]. If a language $L$ has the empty quotient, we say that $L$
has $\emptyset$.
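Since the quotient automaton is the minimal dfa, $\kappa(L)$ can be computed from any complete dfa for $L$ by discarding unreachable states and merging equivalent ones. The following is a minimal sketch of that idea, not a procedure taken from the paper; the function name `quotient_complexity`, the dictionary encoding of $\delta$, and the four-state example dfa are illustrative assumptions.

```python
def quotient_complexity(alphabet, delta, start, accepting):
    """Count the states of the minimal complete dfa, i.e. the distinct quotients."""
    # Step 1: discard states that no word reaches.
    reachable, stack = {start}, [start]
    while stack:
        q = stack.pop()
        for a in alphabet:
            r = delta[q, a]
            if r not in reachable:
                reachable.add(r)
                stack.append(r)
    # Step 2: Moore refinement. Start from accepting/rejecting and keep
    # splitting blocks until the number of blocks stops growing.
    block = {q: int(q in accepting) for q in reachable}
    while True:
        ids, refined = {}, {}
        for q in reachable:
            signature = (block[q],) + tuple(block[delta[q, a]] for a in alphabet)
            refined[q] = ids.setdefault(signature, len(ids))
        if len(ids) == len(set(block.values())):
            return len(ids)
        block = refined


# Hypothetical non-minimal dfa for L = a*b: states 0 and 3 are equivalent,
# state 1 accepts, state 2 is the empty state.
alphabet = "ab"
delta = {(0, "a"): 3, (0, "b"): 1, (3, "a"): 0, (3, "b"): 1,
         (1, "a"): 2, (1, "b"): 2, (2, "a"): 2, (2, "b"): 2}
kappa = quotient_complexity(alphabet, delta, 0, {1})
print(kappa)
```

For this example the three classes correspond to the quotients $a^*b$, $\varepsilon$ and $\emptyset$.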

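To simplify the notation, we write $(L_w)^\varepsilon$ as $L_w^\varepsilon$.
Whenever convenient, the following formulas are used to establish upper bounds
on quotient complexity:

Proposition 1 ([3,4]). If $K$ and $L$ are regular languages, then

$(\overline{L})_w = \overline{L_w}; \quad (K \circ L)_w = K_w \circ L_w.$    (1)

$(KL)_w = K_wL \cup K^\varepsilon L_w \cup \Bigl(\bigcup_{w = uv,\ u, v \in \Sigma^+} K_u^\varepsilon L_v\Bigr).$    (2)

$(L^*)_\varepsilon = \varepsilon \cup LL^*, \quad (L^*)_w = \Bigl(L_w \cup \bigcup_{w = uv,\ u, v \in \Sigma^+} (L^*)_u^\varepsilon L_v\Bigr)L^* \text{ for } w \in \Sigma^+.$    (3)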



3 Closure Operations 

We now turn to the closure of languages under binary relations. All the
relations that we study in this paper are partial orders. Let $\unlhd$ be a
partial order on $\Sigma^*$; the $\unlhd$-closure of a language $L$ is the
language ${}_{\unlhd}L = \{x \in \Sigma^* \mid x \unlhd w \text{ for some } w \in L\}$.
We use $\le$, $\preceq$, $\sqsubseteq$, $\Subset$ for the relations "is a prefix
of", "is a suffix of", "is a factor of", "is a subword of", respectively.
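For finite languages the four closures can be computed directly from the definition. The sketch below is only an illustration for small finite sets of words; the helper names (`prefixes`, `suffixes`, `factors`, `subwords`, `closure`, `is_closed`) are assumptions and do not come from the paper.

```python
from itertools import combinations


def prefixes(w):
    return {w[:i] for i in range(len(w) + 1)}


def suffixes(w):
    return {w[i:] for i in range(len(w) + 1)}


def factors(w):
    return {w[i:j] for i in range(len(w) + 1) for j in range(i, len(w) + 1)}


def subwords(w):
    # Subwords are subsequences: keep any subset of positions, in order.
    return {"".join(c) for k in range(len(w) + 1) for c in combinations(w, k)}


def closure(language, below):
    """All words x with x related to some w in `language` (`below` lists them per word)."""
    return set().union(*(below(w) for w in language))


def is_closed(language, below):
    return closure(language, below) == set(language)


L = {"abc"}
print(sorted(closure(L, factors)))
print(sorted(closure(L, subwords)))
```

The word `ac` is a subword of `abc` but not a factor, so it appears only in the second closure.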

Suppose $L$ is an arbitrary regular language of complexity $n$. If $n = 1$ then
$L = \emptyset$ or $L = \Sigma^*$, and each closure is $L$. We show that the
worst-case complexity for prefix-closure is $n$, for suffix-closure it is
$2^n - 1$, and for factor-closure it is $2^{n-1}$. These bounds are tight for
binary languages. Subword-closure of languages was previously studied by
Okhotin [17] under the name "scattered subwords", but tight upper bounds were
not established. Our next theorem solves this problem.

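Theorem 1 (Closure Operations). Let $L$ be a regular language with
$\kappa(L) = n \ge 2$. Let ${}_{\le}L$, ${}_{\preceq}L$, ${}_{\sqsubseteq}L$,
${}_{\Subset}L$ be the prefix-closure, suffix-closure, factor-closure, and
subword-closure of $L$, respectively. Then

1. $\kappa({}_{\le}L) \le n$.

2. $\kappa({}_{\preceq}L) \le 2^n - 1$ if $L$ does not have $\emptyset$, and $\kappa({}_{\preceq}L) \le 2^{n-1}$ otherwise.

3. $\kappa({}_{\sqsubseteq}L) \le 2^{n-1}$.

4. $\kappa({}_{\Subset}L) \le 2^{n-2} + 1$.

The last bound is tight if $|\Sigma| \ge n - 2$; the other bounds are tight if $|\Sigma| \ge 2$.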

Proof. 1. Given a language $L$ recognized by a dfa $\mathcal{D}$, to get the dfa
for its prefix-closure ${}_{\le}L$, we need only make each non-empty state
accepting. Hence $\kappa({}_{\le}L) \le n$. For tightness, consider the language
$L = \{a^i \mid i \le n - 2\}$. We have ${}_{\le}L = L$ and
$\kappa({}_{\le}L) = n$.
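The construction in part 1 can be sketched as follows: a state is non-empty exactly when some accepting state is reachable from it, and all such states become accepting. This is an illustrative sketch; the function names and the example dfa for $a^*b$ are assumptions, not part of the paper.

```python
def prefix_closure_accepting(alphabet, delta, states, accepting):
    """Accepting set of a dfa for the prefix-closure: every non-empty state."""
    live = set(accepting)
    changed = True
    while changed:
        changed = False
        for q in states:
            # q is non-empty if one step leads to a state already known non-empty.
            if q not in live and any(delta[q, a] in live for a in alphabet):
                live.add(q)
                changed = True
    return live


def accepts(delta, start, accepting, word):
    q = start
    for a in word:
        q = delta[q, a]
    return q in accepting


# Hypothetical example: L = a*b over {a, b}; state 2 is the empty state.
alphabet = "ab"
states = [0, 1, 2]
delta = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 2, (1, "b"): 2,
         (2, "a"): 2, (2, "b"): 2}
closed_accepting = prefix_closure_accepting(alphabet, delta, states, {1})
print(sorted(closed_accepting))
```

The transition function and the set of states are unchanged, which is why the bound in part 1 is $n$.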

2. Having a quotient automaton of a language $L$, we can construct an nfa
for its suffix-closure by making each non-empty state initial. The equivalent
dfa has at most $2^n - 1$ states if $L$ does not have the empty quotient (the
empty set of states cannot be reached), and at most $2^{n-1}$ states otherwise.
To prove tightness, consider the language $L$ defined by the quotient automaton
shown in Fig. 1. Construct an nfa for the suffix-closure of $L$ by making all
states initial. Let us show that the corresponding subset automaton has
$2^n - 1$ reachable and pairwise inequivalent states.




We prove reachability by induction on the size of subsets. The basis,
$|S| = n$, holds true since $\{0, 1, \ldots, n-1\}$ is the initial state. Assume
that each set of size $k$ is reachable, and let $S$ be a set of size $k - 1$. If
$S$ contains state 0 but does not contain state 1, then it can be reached from
the set $S \cup \{1\}$ of size $k$ by $b$. If $S$ contains both 0 and 1, then
there is a state $i$ such that $i \in S$ and $i + 1 \notin S$. Then $S$ can be
reached from $\{s - i \bmod n \mid s \in S\}$ by $a^i$. The latter set contains 0
and does not contain 1, and so is reachable. If a non-empty $S$ does not contain
0, then it can be reached from $\{s - \min S \mid s \in S\}$, which contains 0,
by $a^{\min S}$.

To prove inequivalence notice that the word $a^{n-i}$ is accepted by the nfa
only from state $i$ for all $i = 0, 1, \ldots, n-1$. It follows that all the
states in the subset automaton are pairwise inequivalent.
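The nfa construction for suffix-closure, followed by the subset construction and minimization, can be sketched as below. Fig. 1 is not reproduced here, so the example uses a small hypothetical dfa for $a^*b$ (which has the empty quotient, so the relevant bound is $2^{n-1}$); all function names are illustrative assumptions.

```python
def minimal_size(alphabet, step, start, is_accepting):
    """Explore the dfa given by `step` from `start` and count its Moore classes."""
    states, stack = {start}, [start]
    while stack:
        q = stack.pop()
        for a in alphabet:
            r = step(q, a)
            if r not in states:
                states.add(r)
                stack.append(r)
    block = {q: int(is_accepting(q)) for q in states}
    while True:
        ids, refined = {}, {}
        for q in states:
            signature = (block[q],) + tuple(block[step(q, a)] for a in alphabet)
            refined[q] = ids.setdefault(signature, len(ids))
        if len(ids) == len(set(block.values())):
            return len(ids)
        block = refined


def live_states(alphabet, delta, states, accepting):
    """States from which some accepting state is reachable (non-empty quotients)."""
    live = set(accepting)
    changed = True
    while changed:
        changed = False
        for q in states:
            if q not in live and any(delta[q, a] in live for a in alphabet):
                live.add(q)
                changed = True
    return live


def suffix_closure_complexity(alphabet, delta, states, accepting):
    live = live_states(alphabet, delta, states, accepting)
    accepting = frozenset(accepting)

    def step(subset, a):
        # Transitions into the empty state are dropped from the nfa.
        return frozenset(delta[q, a] for q in subset if delta[q, a] in live)

    # Every non-empty state is initial, so the subset automaton starts in `live`.
    return minimal_size(alphabet, step, frozenset(live),
                        lambda s: bool(s & accepting))


# Hypothetical example with n = 3: L = a*b, state 2 is the empty state.
alphabet = "ab"
states = [0, 1, 2]
delta = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 2, (1, "b"): 2,
         (2, "a"): 2, (2, "b"): 2}
kappa = suffix_closure_complexity(alphabet, delta, states, {1})
print(kappa)
```

Here the suffix-closure is $a^*b \cup \varepsilon$, and the computed value coincides with $2^{n-1} = 4$ for $n = 3$.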
Now consider the case where a language has $\emptyset$. Let $L$ be the language
defined by the quotient automaton shown in Fig. 2. We first remove state $n - 1$
and all transitions going to this state, and then construct an nfa as above. The
proof of reachability of all non-empty subsets of $\{0, 1, \ldots, n-2\}$ is the
same as in the previous case. The empty set can be reached from $\{0\}$ by $b$.
For inequivalence, $(ab)^n$ is accepted only from 0, and $a^{n-1-i}(ab)^n$ only
from $i$ for $i = 1, 2, \ldots, n-2$.




Fig. 2. Quotient automaton of a language L which has 0. 



3. Suppose we have the quotient automaton of a language $L$. To find an nfa for
the factor-closure ${}_{\sqsubseteq}L$, we make all non-empty states of the
quotient automaton both accepting and initial and delete the empty state. Hence
the bound is $2^{n-1}$. The language $L$ defined by the quotient automaton shown
in Fig. 2 meets the bound.

4. To get an $\varepsilon$-nfa for the subword-closure ${}_{\Subset}L$ from the
quotient automaton of $L$, we remove the empty state (if there is no empty
state, then ${}_{\Subset}L = \Sigma^*$), and add an $\varepsilon$-transition
from state $p$ to state $q$ whenever there is a transition from $p$ to $q$ in
the quotient automaton. Since the initial state can reach every non-empty state
through $\varepsilon$-transitions, no other subset containing the initial state
can be reached. Hence there are at most $2^{n-2} + 1$ reachable subsets.

To prove tightness, if $n = 2$, let $\Sigma = \{a, b\}$; then $L = a^*$ meets
the bound. If $n \ge 3$, let $\Sigma = \{a_1, \ldots, a_{n-2}\}$, and
$L = \bigcup_{a_i \in \Sigma} a_i(\Sigma \setminus \{a_i\})^*$. Thus the language
$L$ consists of all words over $\Sigma$ in which the first letter occurs exactly
once. Let $K$ be the subword-closure of $L$. Then
$K = L \cup \{w \in \Sigma^* \mid \text{at least one letter is missing in } w\}$.
For each boolean vector $b = (b_1, b_2, \ldots, b_{n-2})$, define the word
$w(b) = w_1w_2 \cdots w_{n-2}$, in which $w_i = \varepsilon$ if $b_i = 0$ and
$w_i = a_i$ if $b_i = 1$. Now consider the word $\varepsilon$, and each word
$a_1w(b)$. Let us show that all quotients of $K$ by these $2^{n-2} + 1$ words
are distinct. For each binary vector $b$, we have
$a_1a_2 \cdots a_{n-2} \in K_\varepsilon \setminus K_{a_1w(b)}$. Let $b$ and $b'$
be two different vectors with $b_i = 0$ and $b'_i = 1$. Then we have
$a_1a_2 \cdots a_{i-1}a_{i+1}a_{i+2} \cdots a_{n-2} \in K_{a_1w(b)} \setminus K_{a_1w(b')}$.
Thus all quotients are distinct, and so $\kappa(K) \ge 2^{n-2} + 1$. □
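The $\varepsilon$-nfa construction of part 4 can be checked mechanically on the smallest interesting witness, $n = 4$ with $\Sigma = \{a, b\}$ and $L = ab^* \cup ba^*$. The sketch below is an illustration under these assumptions; the function names and the state numbering of the dfa are not from the paper.

```python
def minimal_size(alphabet, step, start, is_accepting):
    """Explore the dfa given by `step` from `start` and count its Moore classes."""
    states, stack = {start}, [start]
    while stack:
        q = stack.pop()
        for a in alphabet:
            r = step(q, a)
            if r not in states:
                states.add(r)
                stack.append(r)
    block = {q: int(is_accepting(q)) for q in states}
    while True:
        ids, refined = {}, {}
        for q in states:
            signature = (block[q],) + tuple(block[step(q, a)] for a in alphabet)
            refined[q] = ids.setdefault(signature, len(ids))
        if len(ids) == len(set(block.values())):
            return len(ids)
        block = refined


def subword_closure_complexity(alphabet, delta, states, start, accepting):
    """Assumes the language is non-empty, so the start state is a non-empty state."""
    accepting = frozenset(accepting)
    # Non-empty states: those from which an accepting state is reachable.
    live = set(accepting)
    changed = True
    while changed:
        changed = False
        for q in states:
            if q not in live and any(delta[q, a] in live for a in alphabet):
                live.add(q)
                changed = True

    def closure(subset):
        # An epsilon-transition parallels every transition, so the epsilon-closure
        # of a state is everything reachable from it among the non-empty states.
        seen, stack = set(subset), list(subset)
        while stack:
            q = stack.pop()
            for a in alphabet:
                r = delta[q, a]
                if r in live and r not in seen:
                    seen.add(r)
                    stack.append(r)
        return frozenset(seen)

    def step(subset, a):
        return closure({delta[q, a] for q in subset if delta[q, a] in live})

    return minimal_size(alphabet, step, closure({start}),
                        lambda s: bool(s & accepting))


# Witness for n = 4: L = ab* U ba*; state 3 is the empty state.
alphabet = "ab"
states = [0, 1, 2, 3]
delta = {(0, "a"): 1, (0, "b"): 2, (1, "a"): 3, (1, "b"): 1,
         (2, "a"): 2, (2, "b"): 3, (3, "a"): 3, (3, "b"): 3}
kappa = subword_closure_complexity(alphabet, delta, states, 0, {1, 2})
print(kappa)
```

For this witness the subword-closure is $a^* \cup b^* \cup ab^* \cup ba^*$, whose quotients are the language itself, $a^* \cup b^*$, $a^*$, $b^*$ and $\emptyset$, in agreement with $2^{n-2} + 1 = 5$.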

4 Basic Operations on Closed Languages 

Now we study the quotient complexity of operations on closed languages. For
regular languages, the following bounds are known [23]: $mn$ for boolean
operations, $m2^n - 2^{n-1}$ for product, $\frac{3}{4} \cdot 2^n$ for star, and
$2^n$ for reversal. The bounds for closed languages are smaller in most cases.
We also show that the bounds are tight, usually for a fixed alphabet. The bounds
for boolean operations and reversal follow from the results on ideal
languages [6].
Theorem 2 (Boolean Operations). If $K$ and $L$ are prefix-closed (or
factor-closed or subword-closed) with $\kappa(K) = m$ and $\kappa(L) = n$, then

1. $\kappa(K \cap L) \le mn - (m + n - 2)$,

2. $\kappa(K \cup L), \kappa(K \oplus L) \le mn$,

3. $\kappa(K \setminus L) \le mn - (n - 1)$.

For suffix-closed languages, $\kappa(K \circ L) \le mn$. All bounds are tight if $|\Sigma| \ge 4$.

Proof. Recall that the complement of a prefix-closed (respectively, suffix-,
factor-, or subword-closed) language is a right (respectively, left, two-sided,
all-sided) ideal. We get all the results using De Morgan's laws and the results
from [6]. □
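The first and third bounds can also be read off directly from Equation (1); the following count is a sketch that is independent of the argument via [6]. Every quotient of $K \cap L$ has the form $K_w \cap L_w$, so there are at most $mn$ candidate pairs. Assuming that $K$ and $L$ each have exactly one empty quotient, every pair with an empty component that matters collapses to the single quotient $\emptyset$:

```latex
\begin{align*}
\kappa(K \cap L) &\le mn - \underbrace{(m + n - 1)}_{\text{pairs with } K_w = \emptyset \text{ or } L_w = \emptyset} + 1
                  = mn - (m + n - 2),\\
\kappa(K \setminus L) &\le mn - \underbrace{n}_{\text{pairs with } K_w = \emptyset} + 1
                  = mn - (n - 1).
\end{align*}
```

For union and symmetric difference no such collapse is forced, which is consistent with the bound $mn$ in part 2.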

Remark 1. If $L$ is prefix-closed, then either $L = \Sigma^*$ or $L$ has
$\emptyset$ as a quotient. Moreover, each quotient of $L$ is either accepting or
$\emptyset$.

Remark 2. For a suffix-closed language $L$, if $v$ is a suffix of $w$ then
$L_w \subseteq L_v$. In particular, $L_w \subseteq L_\varepsilon = L$ for each
word $w$ in $\Sigma^*$.

Theorem 3 (Product). Let $K$ and $L$ be closed languages with $\kappa(K) = m$
and $\kappa(L) = n$, and let $k$ be the number of accepting quotients of $K$. If
$m = 1$ or $n = 1$, then $\kappa(KL) = 1$. Otherwise,

1. If $K$ and $L$ are prefix-closed, then $\kappa(KL) \le (m + 1) \cdot 2^{n-2}$.

2. If $K$ and $L$ are suffix-closed, then $\kappa(KL) \le (m - k)n + k$.

3. If $K$ and $L$ are both factor- or both subword-closed, then $\kappa(KL) \le m + n - 1$.

All bounds are tight if $|\Sigma| \ge 3$.

Proof. If $m = 1$, then $K = \emptyset$ or $K = \Sigma^*$, and so
$KL = \emptyset$ or, since $\varepsilon \in L$, $KL = \Sigma^*$. Thus
$\kappa(KL) = 1$. The case of $n = 1$ is similar. Now let $m, n \ge 2$.

1. If $K$ and $L$ are prefix-closed, then $\varepsilon \in K$, and, by Remark 1,
both languages have $\emptyset$ as a quotient. The quotient $(KL)_w$ is given by
Equation (2). If $K_w$ is accepting, then $L$ is always in the union, and there
are $2^{n-2}$ non-empty subsets of non-empty quotients of $L$ that can be added.
Since there are $m - 1$ accepting quotients of $K$, there are $(m - 1)2^{n-2}$
such quotients of $KL$. If $K_w$ is rejecting, then $2^{n-1}$ subsets of
non-empty quotients of $L$ can be added. Altogether,
$\kappa(KL) \le 2^{n-1} + (m - 1)2^{n-2} = (m + 1)2^{n-2}$.

For tightness, consider prefix-closed languages $K$ and $L$ defined by the
quotient automata of Fig. 3 (if $n = 2$, then $L = \{a, c\}^*$). Construct an
$\varepsilon$-nfa for the language $KL$ from these quotient automata by adding an
$\varepsilon$-transition from states $q_0, q_1, \ldots, q_{m-2}$ to state 0. The
initial state of the nfa is $q_0$, and the accepting states are
$0, 1, \ldots, n-2$. Let us show that there are $(m + 1) \cdot 2^{n-2}$ reachable
and pairwise inequivalent states in the corresponding subset automaton.

State $\{q_0, 0\}$ is the initial state, and each state
$\{q_0, 0, i_1, i_2, \ldots, i_k\}$, where
$1 \le i_1 < i_2 < \cdots < i_k \le n - 2$, can be reached from state
$\{q_0, 0, i_2 - i_1, \ldots, i_k - i_1\}$ by the word $ab^{i_1 - 1}$. For each
subset $S$ of $\{0, 1, \ldots, n-2\}$ containing state 0, each state
$\{q_i\} \cup S$ with $1 \le i \le m - 1$ can be reached from state
$\{q_0\} \cup S$ by $c^i$. If a non-empty set $S$ does not contain state 0, then
state $\{q_{m-1}\} \cup S$ can be reached from state
$\{q_{m-1}\} \cup \{s - \min S \mid s \in S\}$, which contains state 0, by
$a^{\min S}$. State $\{q_{m-1}, n-1\}$ can be reached from state
$\{q_{m-1}, n-2\}$ by $b$.




Fig. 3. Quotient automata of prefix-closed languages K and L. 



To prove inequivalence, notice that the word $b^n$ is accepted by the quotient
automaton for $L$ only from state 0, and the word $a^{n-1-i}b^n$ only from state
$i$ ($1 \le i \le n - 2$). It follows that two different states
$\{q_{m-1}\} \cup S$ and $\{q_{m-1}\} \cup T$ are inequivalent. It follows that
states $\{q_i\} \cup S$ and $\{q_i\} \cup T$ are inequivalent as
well. States {qi}US and {qj}{JT with i < j can be distinguished by c"^~^^^b^^ab". 
Hence the subset automaton has $(m + 1) \cdot 2^{n-2}$ reachable and pairwise
inequivalent states, and so $\kappa(KL) = (m + 1)2^{n-2}$.

2. If $K$ and $L$ are suffix-closed, then, by Remark 2, for each word $w$ we have

$(KL)_w = K_wL \cup K^\varepsilon L_w \cup \Bigl(\bigcup_{w = uv,\ u, v \in \Sigma^+} K_u^\varepsilon L_v\Bigr) = K_wL \cup L_x,$

for some suffix $x$ of $w$. If $K_w$ is a rejecting quotient, there are at most
$(m - k)n$ such quotients. If $K_w$ is accepting, then $\varepsilon \in K_w$,
and since $L_x \subseteq L_\varepsilon = L \subseteq K_wL$, we have
$(KL)_w = K_wL$. There are at most $k$ such quotients. Therefore there are at
most $(m - k)n + k$ quotients in total.

To prove tightness, let $K$ and $L$ be ternary suffix-closed languages defined by
quotient automata shown in Fig. 4. Consider the words $\varepsilon = a^0b^0$, and $a^ib^j$ with







Fig. 4. Quotient automata of suffix-closed languages K and L. 
$1 \le i \le m - 1$ and $0 \le j \le n - 1$. Let us show that all quotients of $KL$ by these
words are distinct. Let ^ {k, £), and let x = a^V and y = a^b^. If i < fc, take 
z = a'^~^~''b'^c. Then xz is in KL, while yz is not, and so z G {KL)^ \ {KL)y. 
Hi = k and j < i, take z = a^b'^-^-^c. We again have z e (KL)^ \ {KL)y. 
Thus the language $KL$ has at least $(m - 1)n + 1$ distinct quotients, and so
$\kappa(KL) = (m - 1)n + 1$.

Notice that, if the quotients $K_{a^i}$ with $0 \le i \le k - 1$ are accepting,
then the resulting product has quotient complexity $(m - k)n + k$.

3. It suffices to derive the bound for factor-closed languages, since every
subword-closed language is also factor-closed. Since factor-closed languages are
suffix-closed, $\kappa(KL) \le (m - k)n + k$. The language $K$ has at most one
rejecting quotient, because it is prefix-closed. Thus, $k = m - 1$ and
$\kappa(KL) \le m + n - 1$.

For tightness, consider binary subword-closed languages
$K = \{w \in \{a, b\}^* \mid a^{m-1} \text{ is not a subword of } w\}$ and
$L = \{w \in \{a, b\}^* \mid b^{n-1} \text{ is not a subword of } w\}$ with
$\kappa(K) = m$ and $\kappa(L) = n$. Consider the word $w = a^{m-1}b^{n-1}$. This
word is not in the product $KL$. However, removing any non-empty subword from
$w$ results in a word in $KL$. Therefore, $\kappa(KL) \ge m + n - 1$. □
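The subword-closed witness of part 3 is simple enough to check by machine for small $m$ and $n$. The sketch below builds the two counting dfas, determinizes the usual nfa for the product, and minimizes; it is an illustration only, and the names `bounded_count_dfa` and `product_complexity` are assumptions rather than notation from the paper.

```python
def minimal_size(alphabet, step, start, is_accepting):
    """Explore the dfa given by `step` from `start` and count its Moore classes."""
    states, stack = {start}, [start]
    while stack:
        q = stack.pop()
        for a in alphabet:
            r = step(q, a)
            if r not in states:
                states.add(r)
                stack.append(r)
    block = {q: int(is_accepting(q)) for q in states}
    while True:
        ids, refined = {}, {}
        for q in states:
            signature = (block[q],) + tuple(block[step(q, a)] for a in alphabet)
            refined[q] = ids.setdefault(signature, len(ids))
        if len(ids) == len(set(block.values())):
            return len(ids)
        block = refined


def bounded_count_dfa(letter, other, limit):
    """Dfa for words over {letter, other} with fewer than `limit` copies of `letter`."""
    delta = {}
    for i in range(limit + 1):
        delta[i, letter] = min(i + 1, limit)   # state `limit` is the empty state
        delta[i, other] = i
    return delta, 0, frozenset(range(limit))


def product_complexity(alphabet, dfa_k, dfa_l):
    (dk, start_k, acc_k), (dl, start_l, acc_l) = dfa_k, dfa_l

    def make(p, subset):
        # Whenever the K-component accepts, a run in L may start here.
        if p in acc_k:
            subset = subset | {start_l}
        return (p, frozenset(subset))

    def step(state, a):
        p, subset = state
        return make(dk[p, a], {dl[q, a] for q in subset})

    return minimal_size(alphabet, step, make(start_k, set()),
                        lambda state: bool(state[1] & acc_l))


m, n = 3, 3
K = bounded_count_dfa("a", "b", m - 1)   # a^(m-1) is not a subword
L = bounded_count_dfa("b", "a", n - 1)   # b^(n-1) is not a subword
kappa = product_complexity("ab", K, L)
print(kappa)
```

For $m = n = 3$ the five quotients are those by the prefixes $\varepsilon$, $a$, $aa$, $aab$, $aabb$ of the word $w = a^{m-1}b^{n-1}$ used in the proof.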

Theorem 4 (Star). Let $L$ be a closed language with $\kappa(L) = n \ge 2$.

1. If $L$ is prefix-closed, then $\kappa(L^*) \le 2^{n-2} + 1$.

2. If $L$ is suffix-closed, then $\kappa(L^*) \le n$ if $L = L^*$ and $\kappa(L^*) \le n - 1$ if $L \ne L^*$.

3. If $L$ is factor- or subword-closed, then $\kappa(L^*) \le 2$.

If $\kappa(L) = 1$, then $\kappa(L^*) \le 2$. All bounds are tight if $|\Sigma| \ge 2$.

Proof. 1. For every non-empty word $w$, the quotient $(L^*)_w$ is given by
Equation (3). If $L$ is prefix-closed, then so are $L^*$ and $(L^*)_w$. Thus, if
$(L^*)_w$ is non-empty, then it must contain the empty word. Hence
$(L^*)_w \supseteq L^* \supseteq LL^* \supseteq L$. Since the empty quotient of
$L$ and $L$ itself are always contained in every non-empty quotient of $L^*$,
there are at most $2^{n-2}$ non-empty quotients of $L^*$. Since there is at most
one empty quotient, there are at most $2^{n-2} + 1$ quotients in total. The
quotient $(L^*)_\varepsilon$ has already been counted, since $L$ is closed and
$\varepsilon \in L$ implies $(L^*)_\varepsilon = LL^*$, which has the form of
Equation (3).

If $n = 1$ and $n = 2$, the bound 2 is met by $L = \emptyset$ and
$L = \varepsilon$, respectively. Now let $n \ge 3$ and let $L$ be the
prefix-closed language defined by the quotient automaton shown in Fig. 5;
transitions not depicted in the figure go to state $n - 1$. Construct an
$\varepsilon$-nfa for $L^*$ by removing state $n - 1$ and adding an
$\varepsilon$-transition from all the remaining states to the initial state. Let
us show that $2^{n-2} + 1$ states are reachable and pairwise inequivalent in the
corresponding subset automaton.

We first prove that each subset of $\{0, 1, \ldots, n-2\}$ containing state 0 is
reachable. The proof is by induction on the size of the subsets. The basis,
$|S| = 1$, holds true since $\{0\}$ is the initial state of the subset
automaton. Assume that each set of size $k$ containing state 0 is reachable, and
let $S = \{0, i_1, i_2, \ldots, i_k\}$, where
$0 < i_1 < i_2 < \cdots < i_k \le n - 2$, be a set of size $k + 1$. Then $S$ can
be reached from the set $\{0, i_2 - i_1, \ldots, i_k - i_1\}$ of size $k$ by
$ab^{i_1 - 1}$. Since the latter set is reachable by the induction hypothesis,
the set $S$ is reachable as well. The
empty set can be reached from $\{0\}$ by $b$, and we have $2^{n-2} + 1$ reachable states.

To prove inequivalence of these states notice that the word 6"^'^ is accepted 
by the nfa only from state 1, and each word 6"~^~'c&"~^ (2 < i < n — 2), only 
from state i. It follows that all the states in the subset automaton are pairwise 
inequivalent. 

2. For a non-empty suffix-closed language $L$, the quotient $(L^*)_\varepsilon$
is $LL^*$, which is of the same form as the quotients by a non-empty word $w$
given by Equation (3),
$(L^*)_w = (L_w \cup L_{v_1} \cup \cdots \cup L_{v_k})L^*$, where the $v_i$ are
suffixes of $w$, and $v_k$ is the shortest. By Remark 2, if $v$ is a suffix of
$w$, then $L_w \subseteq L_v$. Thus the quotient becomes
$(L^*)_w = L_{v_k}L^*$. There are at most $n$ such quotients.

If $L \ne L^*$ for a non-empty suffix-closed language $L$, then there must be two
words $x, y$ in $L$ such that $xy \notin L$. Hence
$y \in L_\varepsilon \setminus L_x$, and so $L_\varepsilon \ne L_x$. However,
since $\varepsilon \in L_x$ and $L^*$ is suffix-closed, we have
$(L^*)_\varepsilon = L^* \subseteq L_xL^* \subseteq (L^*)_x \subseteq (L^*)_\varepsilon$,
and so $(L^*)_x = (L^*)_\varepsilon$. It follows that $\kappa(L^*) \le n - 1$.

For $n = 1$, $L = \emptyset$ and for $n = 2$, $L = \varepsilon$ meet the bound 2.
Let $n \ge 3$. If $L = (a \cup ba^{n-2})^*$, then $L$ is suffix-closed,
$\kappa(L) = n$, and $L^* = L$. If
$L = \varepsilon \cup \bigcup_{i=0}^{n-3} a^ib$, then $L$ is suffix-closed,
$\kappa(L) = n$, $L^* = (\bigcup_{i=0}^{n-3} a^ib)^*$, and $\kappa(L^*) = n - 1$.

3. If each letter in $\Sigma$ appears in some word of a factor-closed language
$L$, then $L^* = \Sigma^*$ and $\kappa(L^*) = 1$. Otherwise, $\kappa(L^*) = 2$.
The bound is met by the subword-closed language
$L = \{w \in \{a, b\}^* \mid w = a^i \text{ and } 0 \le i \le n - 2\}$. □

Since the operation of reversal commutes with complementation, the following
bounds follow from the results on ideal languages in [6]:

Theorem 5 (Reversal). Let $L$ be a closed language with $\kappa(L) = n \ge 2$.

1. If $L$ is prefix-closed, then $\kappa(L^R) \le 2^{n-1}$. The bound is tight if $|\Sigma| \ge 2$.

2. If $L$ is suffix-closed, then $\kappa(L^R) \le 2^{n-1} + 1$. The bound is tight if $|\Sigma| \ge 3$.

3. If $L$ is factor-closed, then $\kappa(L^R) \le 2^{n-2} + 1$. The bound is tight if $|\Sigma| \ge 3$.

4. If $L$ is subword-closed, then $\kappa(L^R) \le 2^{n-2} + 1$. The bound is tight if $|\Sigma| \ge 2n$.

If $\kappa(L) = 1$, then $\kappa(L^R) = 1$. □

Unary Languages: Unary closed languages have special properties because the
product of unary languages is commutative. The classes of prefix-closed,
suffix-closed, factor-closed, and subword-closed unary languages all coincide.
If a unary closed language $L$ is finite, then either it is empty and has
$\kappa(L) = 1$, or has the form $\{a^i \mid i \le n - 2\}$, for some $n \ge 2$,
and has $\kappa(L) = n$. If $L$ is infinite, then $L = a^*$, and
$\kappa(L) = 1$. The bounds for unary languages are given in Tables 1 and 2 on
page 11.
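As a worked instance of how the unary entries arise, consider the product of two finite non-empty unary closed languages, assuming $m, n \ge 2$. The derivation uses only the normal form $\{a^i \mid i \le n - 2\}$ stated above, for which a language with longest word $a^N$ has $N + 2$ quotients:

```latex
\begin{align*}
K &= \{a^i \mid i \le m - 2\}, \qquad L = \{a^j \mid j \le n - 2\},\\
KL &= \{a^{i+j} \mid i \le m - 2,\ j \le n - 2\} = \{a^k \mid k \le m + n - 4\},\\
\kappa(KL) &= (m + n - 4) + 2 = m + n - 2 .
\end{align*}
```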
5 Kuratowski Algebras Generated by Closed Regular
Languages

A theorem of Kuratowski [15] states that, given a topological space, at most 14
distinct sets can be produced by repeatedly applying the operations of closure
and complement to a given set. A closure operation on a set $S$ is an operation
$\Box : 2^S \to 2^S$ satisfying the following conditions for any subsets $X, Y$
of $S$: (1) $X \subseteq X^\Box$, (2) $X \subseteq Y$ implies
$X^\Box \subseteq Y^\Box$, (3) $X^{\Box\Box} \subseteq X^\Box$.

Kuratowski's theorem was studied in the setting of formal languages in [5]. 
Positive closure and Kleene closure (star) are both closure operations. It was 
shown in [5] that at most 10 distinct languages can be produced by repeatedly 
applying the operations of positive closure and complement to a given language, 
and at most 14 distinct languages can be produced with Kleene closure instead 
of positive closure. We consider here the case where the given language is closed 
and regular, and give upper bounds for the complexity of the resulting languages. 
Here we denote the complement of a language $L$ by $L^-$. Moreover, the positive
closure of the complement of $L$ is denoted by $L^{-+}$, etc.

We begin with positive closure. Let $L$ be a $\unlhd$-closed language not equal
to $\Sigma^*$. Then $L^-$ is an ideal, and $L^{-+} = L^-$. In addition, $L^+$ is
also $\unlhd$-closed, so $L^{+-+} = L^{+-}$. Hence there are at most 4 distinct
languages that can be produced with positive closure and complementation.

Theorem 6. The worst-case complexities in every 4-element algebra generated by a
closed language $L$ with $\kappa(L) = n$ under positive closure and complement
are: $\kappa(L) = \kappa(L^-) = n$, $\kappa(L^+) = \kappa(L^{+-}) = f(n)$, where
$f(n)$ is: $2^{n-2} + 1$ for prefix-closed languages, $n - 1$ for suffix-closed
languages, and 2 for factor- and subword-closed languages. There exist closed
languages that meet these bounds.

Proof. Since $L^+ = L^*$ for a non-empty closed language, we have
$\kappa(L^+) = \kappa(L^*)$, and the upper bounds $f(n)$ follow from our results
on the quotient complexity of the star operation; in the case of suffix-closed
languages, to get a 4-element algebra we need $L \ne L^*$. All the languages
that we have used in Theorem 4 to prove tightness can be used as examples
meeting the bound $f(n)$. □

The case of Kleene closure is similar. Let $L$ be a $\unlhd$-closed language such
that $L \notin \{\emptyset, \Sigma^*\}$. Then $L^-$ is an ideal and $L^-$ does
not contain $\varepsilon$. Thus $L^{-*} = L^- \cup \varepsilon$ and
$L^{-*-} = L \setminus \varepsilon$, which gives at most four languages thus far.
Now $L^* = (L \setminus \varepsilon)^*$, and $L^*$ is also $\unlhd$-closed. By
the previous reasoning, we have at most four additional languages, giving a
total of eight languages as the upper bound. The 8-element algebras are of the
form ($L$, $L^-$, $L^{-*} = L^- \cup \varepsilon$,
$L^{-*-} = L \setminus \varepsilon$, $L^*$, $L^{*-}$,
$L^{*-*} = L^{*-} \cup \varepsilon$, $L^{*-*-} = L^* \setminus \varepsilon$).
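The counts 4 and 8 can be observed on a small example by working with languages truncated to words of length at most `max_len`. Truncation commutes with complement, positive closure and star on such words, so distinct truncations witness distinct languages. The sketch below is an illustration under that assumption; the example language $\{\varepsilon, a, ab\}$ and all function names are hypothetical choices, not taken from the paper.

```python
from itertools import product


def words_up_to(alphabet, max_len):
    return frozenset("".join(t) for k in range(max_len + 1)
                     for t in product(alphabet, repeat=k))


def plus(language, max_len):
    """Positive closure, restricted to words of length at most max_len."""
    result, frontier = set(language), set(language)
    while frontier:
        new = {u + v for u in frontier for v in language
               if len(u) + len(v) <= max_len} - result
        result |= new
        frontier = new
    return frozenset(result)


def star(language, max_len):
    return frozenset(plus(language, max_len) | {""})


def orbit(language, closure_op, universe, max_len):
    """All languages obtained by repeatedly applying closure_op and complement."""
    start = frozenset(language)
    seen, todo = {start}, [start]
    while todo:
        current = todo.pop()
        for nxt in (closure_op(current, max_len), frozenset(universe - current)):
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen


max_len = 4
universe = words_up_to("ab", max_len)
L = {"", "a", "ab"}                      # prefix-closed, not equal to L*
print(len(orbit(L, plus, universe, max_len)))
print(len(orbit(L, star, universe, max_len)))
```

With this example the orbit under positive closure and complement consists of $L$, $L^-$, $L^+$ and $L^{+-}$, and the orbit under star and complement consists of the eight languages listed above.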

Theorem 7. The worst-case complexities in every 8-element algebra generated by a
closed language $L$ with $\kappa(L) = n$ under Kleene closure and complement are:
$\kappa(L) = \kappa(L^-) = n$, $\kappa(L^*) = \kappa(L^{*-}) = f(n)$,
$\kappa(L^{*-*}) = \kappa(L^{*-*-}) = f(n) + 1$,
$\kappa(L^{-*}) = \kappa(L^{-*-}) = n + 1$, where $f(n)$ is: $2^{n-2} + 1$ for
prefix-closed languages, $n - 1$ for suffix-closed languages, and 2 for factor-
and subword-closed languages. Moreover, there exist closed languages that meet
these bounds.
Proof. Since $L^{-*-} = L \setminus \varepsilon$ and
$L^{*-*-} = L^* \setminus \varepsilon$, we have $\kappa(L^{-*-}) \le n + 1$ and
$\kappa(L^{*-*-}) \le f(n) + 1$. In the case of suffix-closed languages, since
$L$ must be distinct from $L^*$, we have $f(n) = n - 1$ by Theorem 4.

1. Let $L$ be the prefix-closed language defined by the quotient automaton in
Fig. 5 on page 8; then $L$ meets the upper bound on star. Add a loop with a new
letter $d$ in each state and denote the resulting language by $K$. Then $K$ is a
prefix-closed language with $\kappa(K) = n$ and
$\kappa(K \setminus \varepsilon) = n + 1$. Next we have
$\kappa(K^*) = \kappa(L^*) = 2^{n-2} + 1$ and
$\kappa(K^* \setminus \varepsilon) = 2^{n-2} + 2$.

2. Let $L = b^* \cup \bigcup_{i=1}^{n-3} b^*a^ib$. Then $L$ is a suffix-closed
language with $\kappa(L) = n$ and $\kappa(L \setminus \varepsilon) = n + 1$.
Next, $\kappa(L^*) = n - 1$, and $\kappa(L^* \setminus \varepsilon) = n$.

3. Let $L = \{w \in \{a, b, c\}^* \mid w = b^*a^i \text{ and } 0 \le i \le n - 2\}$.
Then $L$ is a subword-closed language with $\kappa(L) = n$ and
$\kappa(L \setminus \varepsilon) = n + 1$. Next $L^* = \{a, b\}^*$, and so
$\kappa(L^*) = 2$ and $\kappa(L^* \setminus \varepsilon) = 3$. □

6 Conclusions 

Tables 1 and 2 summarize our complexity results. The complexities for regular 
languages are from [23], except those for difference and symmetric difference, 
which are from [4]. The bounds for boolean operations and reversal of closed 
languages are direct consequences of the results in [6]. In Table 2, $k$ is the number
of accepting quotients of $K$.





                                              $K \cup L$    $K \cap L$           $K \setminus L$   $K \oplus L$
unary closed                                  $\max(m, n)$  $\max(m, n)$         $m$               $\max(m, n)$
$\le$-, $\sqsubseteq$-, $\Subset$-closed      $mn$          $mn - (m + n - 2)$   $mn - (n - 1)$    $mn$
$\preceq$-closed                              $mn$          $mn$                 $mn$              $mn$
regular                                       $mn$          $mn$                 $mn$              $mn$

Table 1. Bounds on quotient complexity of boolean operations.





                         ${}_{\unlhd}L$    $KL$                  $K^*$                     $K^R$
unary closed             $n$               $m + n - 2$           $2$                       $n$
$\le$-closed             $n$               $(m + 1)2^{n-2}$      $2^{n-2} + 1$             $2^{n-1}$
$\sqsubseteq$-closed     $2^{n-1}$         $m + n - 1$           $2$                       $2^{n-2} + 1$
$\Subset$-closed         $2^{n-2} + 1$     $m + n - 1$           $2$                       $2^{n-2} + 1$
$\preceq$-closed         $2^n - 1$         $(m - k)n + k$        $n$                       $2^{n-1} + 1$
regular                                    $m2^n - k2^{n-1}$     $2^{n-1} + 2^{n-k-1}$     $2^n$

Table 2. Bounds on quotient complexity of closure, product, star and reversal.